1 Introduction

Distributional analysis has been a widely used technique in the study of 
social choice in Euclidean models [28,29,1,3,8,15,5,2,23] (see also [4] and 
[19, Chaps. 11-12]) for more than two decades. In distributional analysis, a
continuum or infinite population of voters is analyzed, where the population
follows some probability distribution $\mu$.

Infinite populations do not exist. Therefore, the principal purpose of 
distributional analysis must be to give insight into the behavior of large 
but finite populations. 

In this paper it is shown that distributional analysis is flawed when 
applied to this end. The problem is essentially one of convergence: if the 
limiting case is to give insight into the large finite case, behavior of the latter 
should converge to behavior of the former as the population grows. Unfor- 
tunately, it turns out that properties of finite populations do not in general 
converge to the properties of infinite populations. In some cases a distri- 
butional analysis will predict that a point is in the core with probability 1, 
while the true probability converges to 0. Thus analysis of infinite popu- 
lations may fail to yield any information about finite populations, however 
large. 

An alternative technique 

An alternative probabilistic technique for the study of social choice is
termed here the finite sample method. In this method, $n$ points are
independently sampled from the distribution $\mu$. This random finite sample
from $\mu$ forms a configuration of $n$ points whose properties are analyzed.
A typical question would be: “what is the probability, as a function of $n$,
that the configuration generated has nonempty core?” Typical answers to
these questions are bounds or asymptotically close estimates for the desired
probability.

It is sometimes possible to combine distributional analysis with finite 
sample analysis to make correct predictions about the asymptotic behavior 
of large populations. An example of this is found in [2]. We expose some 
key properties which enable the convergence in this case, enabling a simpler 
and more general proof of the convergence of min-max majority rule. We 
also estimate the population size for which the results are meaningful, i.e.,
at which convergence begins to take hold. For committee sizes of 10,000
or more, a 2/3 majority rule is likely to be stable, under the concavity 
assumptions of [2]. For committee sizes of 250 or less, there is some doubt 
as to whether 2/3 majority rule is necessarily stable. 

Following a suggestion due to Robert Foley, Richard McKelvey, and
Gideon Weiss, we explore the use of uniform convergence theorems to
transform distributional results into finite sample results. Theorems about the
uniform convergence of empirical measures (e.g., [18]) yield a simpler and
more general proof of Simpson-Kramer min-max convergence [2] and a
simpler though less general proof of yolk shrinkage [26]. The analysis suggests
a rule of thumb as to when one might expect distributional analysis to give
accurate or inaccurate predictions about the behavior of finite populations.

A careful reading of Tullock’s original paper [28] reveals a clear insightful 
distinction between the distributional and finite sample methods, and a 
remarkable foreshadowing of some of the outcomes of finite sample analysis. 

Empirical study of social choice 

Another motivation for analyzing the distributional method, besides the 
clarification of results in the literature, is to help uncover a rigorous foun- 
dation for statistical empirical study of group choice. One would like to poll 
the members of a committee, assembly, or population (or in some other way 
extract data on their positions on the issues), and based on that data and 
some solution concept, make a prediction with some confidence regarding 
what the outcome will be. How do we experimentally test a solution con- 
cept? Ignoring the difficulties of data acquisition (e.g. sincerity), and any 
computational issues, there is still a problem regarding the stability of the 
solution concept with respect to individual perturbations. In other words, 
a person’s views on issues are not perfectly constant; one can even change 
one’s mind in the voting booth. How can we know that a prediction based 
on polls taken one day is apt to be close to the actual results the next day? 

We may think of each person’s views as having a probability distri- 
bution. When we interview a person we get a random sample from this 
distribution. When that person votes or negotiates in committee, it is on 
the basis of another random sample from this distribution. The problem 
is to establish rigorously the stability of a solution concept under these 
conditions. 

In statistical terms, the finite sample from $\mu$ yields an empirical measure
$\mu_n$. A solution concept is a statistic, a function $f$ operating on probability
measures. If $f$ is a consistent statistic, then the limiting behavior of $f(\mu_n)$
will (almost surely) be like $f(\mu)$ and the solution concept is stable.

This issue has received a great deal of attention for the classical core 
or Nash equilibrium under the term “structural stability”. The outcome 
is negative: the Nash solution concept is not usually applicable, and is 
never structurally stable in three or more dimensions [20]. In section 6 we 
illustrate how theorems for the uniform convergence of empirical measures 
[18, e.g.] can be invoked to establish the stability of other more widely 
applicable concepts. 

The outline of the paper follows: the remainder of this section reviews 
essential definitions of the spatial model. Section 2 introduces the two 
methods by way of a small example. Section 3 analyzes the distributional 
method. Section 4 demonstrates in greater detail a case from [1] where the
distributional method gives a misleading result. Section 5 discusses a case
(the 64%-rule of Caplin and Nalebuff [2]) where the finite sample method
may be combined with the distributional method to achieve results valid
for large finite populations. Large is argued to be somewhere between 250
and 10,000 in this case. Section 6 introduces the use of uniform conver- 
gence of empirical measures and discusses in general when one may expect 
the distributional method to be useful and when we may expect it to be 
misleading. Section 7 concludes by re-examining Tullock’s original paper 
[28]. 



1.1 Definition of the spatial model 

In the Euclidean spatial model, a social choice involving $m$ issues is to be
made. The possible proposals are represented as vectors in $\mathbb{R}^m$. Each
individual $i$ has a most preferred point $x_i \in \mathbb{R}^m$. This point will be referred
to as a voter point, or simply a voter. Under Euclidean preferences, an
individual faced with two alternatives will select the one closest to her most
preferred point, under the Euclidean norm. This model is more general
than it appears: Davis et al. [3] show it is equivalent to any linearly
transformed spatial model which maintains the properties of an inner product;
Grandmont [8] (see also [2, section 5]) observes that the essential property
of the Euclidean model is often the “division-by-hyperplane” property (in
the Euclidean case, the perpendicular bisector of two points separates those
who prefer one point to the other), and so results in the Euclidean model
usually apply to the more general class of “intermediate preferences”,
including constant elasticity of substitution (C.E.S.) utility functions (these
extend the class of Davis et al. by allowing a change to an $L_p$ norm from
the $L_2$ norm).

2 Two methods and an example 



Let us begin with a simple two-dimensional model based on an example in
[23]. Let $\mu$ be a probability distribution that is uniform on a circle (the
circumference of a disk). Place a single voter $v_1$ at the center of the circle,
which for convenience we locate at the origin. Randomly generate $n - 1$
additional voter points $v_2, \ldots, v_n$, where $n$ is even, according to $\mu$.

We introduce some terminology. A particular realization of this random
process is a configuration, a specific set of points $V = \{v_1, \ldots, v_n\}$ with
specific locations in $\mathbb{R}^m$. In this case $V$ is a finite configuration. If $|V|$ is
infinite, $V$ is an infinite configuration.

The finite sample method

Next we illustrate the method of finite sample analysis on the model
just stated. The question we pose is: what is the probability that $v_1$ is
undominated in the configuration $V$? A result of Schofield’s [23] implies
that the probability is positive, but the exact probability was not known
until recently: $v_1$ is undominated with probability $1/2^{n-2}$. Notice that the
answer to the question is parameterized by $n$, as one would expect. The
proof is sketched here since it will be needed in Sections 4 and 5.

Theorem 1 Place $v_1$ at the origin and generate $v_2, \ldots, v_n$ independently
according to any nondegenerate sign-invariant distribution $\mu$. Then for all
even $n$, the probability $v_1$ is undominated is $1/2^{n-2}$.

Proof ([25]): Associate with each $\theta$, $0 \le \theta < \pi$, a line passing through the
origin and an associated orientation. See Figure 1. Denote this line by
$L(\theta)$. The open half space the line is oriented towards is the “front” and
the other open half space is the “back” of the line $L(\theta)$.

Since the points are drawn from a nondegenerate distribution, the
probability is 0 that any pair of the points $v_2, \ldots, v_n$ are collinear with the origin.
Henceforth we assume this event does not occur.

For any $0 \le \theta < \pi$ define the gap function $g(\theta)$ to equal the number
of voter points in the front half plane of $L(\theta)$ minus the number of voter
points in the back of $L(\theta)$. If $g(\theta) = -1$ or $1$, the line $L(\theta)$ divides the $n - 1$
points $v_2, \ldots, v_n$ as equally as possible given that $n$ is even. If however
the gap function ever attains $|g(\theta)| \ge 3$ then one side of the line will
contain at least $1 + n/2$ points and $v_1$ will not be a core point.

Starting at $\theta = 0$, increase $\theta$ continuously to $\pi$. Because no two points
are collinear with $v_1$, $g(\theta)$ will change by either $+2$ or $-2$ as the line $L(\theta)$
crosses over a point $v_i$. Let $\theta_1 < \cdots < \theta_{n-1}$ denote the values of $\theta$ at which $L(\theta)$
crosses over a voter and let $X_i = +2$ or $-2$ accordingly as the $i$th crossover
increases or decreases $g(\theta)$. The key observation is that the gap function
executes a random walk as $\theta$ goes from 0 to $\pi$.

Lemma 1. $X_1, \ldots, X_{n-1}$ are independent identically distributed variables
taking values $2$ with probability $1/2$ and $-2$ with probability $1/2$.

Proof of Lemma: the proof follows easily from the sign-invariance of $\mu$. See
Figure 2: the regions I and II are equally likely to contain the next point
as we sweep $L(\theta)$ around. Details are given in [25].

There are $2^{n-1}$ possible paths for the random walk of the $X_i$ to take.
Of these, only two will keep the gap function at $|g(\theta)| \le 1$. These are the
alternating paths $X_i = 2(-1)^i$ and $X_i = -2(-1)^i$. Any nonalternating
sequence must contain two consecutive $+2$’s or two consecutive
$-2$’s. If this ever happens, the gap function will change by 4 and so must leave
the range $\{-1, 1\}$. By Lemma 1, each of the $2^{n-1}$ possible paths occurs with
equal probability $1/2^{n-1}$. Therefore the probability that $\max_\theta |g(\theta)| \le 1$ is
$2/2^{n-1} = 1/2^{n-2}$ as desired. This completes the proof of Theorem 1.

A few remarks about Theorem 1: the proof only assumes $\mu$ is
sign-invariant: $\mu(y) = \mu(-y)$, so it applies to the uniform rectangle distribution
in [29, 1] and many others. For the model under discussion, Theorem 1
gives a stronger outcome, for obviously (w.p. 1) no other point in $\mathbb{R}^2$ can
be undominated. Thus the configuration $V$ has nonempty core with exact
probability $1/2^{n-2}$.
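Theorem 1 can be checked numerically for the circle model of this section. The following is a minimal Monte Carlo sketch and is not part of the original analysis; the function names, the seed, and the trial count are illustrative assumptions. It relies on the observation from the proof that the origin is dominated exactly when some open half-plane through the origin contains more than $n/2$ of the sampled voters.

```python
import math
import random


def origin_undominated(angles):
    """True if no open half-plane through the origin holds a strict majority.

    `angles` are the directions of the n - 1 sampled voters on the circle;
    one extra voter sits at the origin, so n = len(angles) + 1 (assumed even).
    """
    n = len(angles) + 1
    pts = sorted(a % (2 * math.pi) for a in angles)
    k = len(pts)
    # Unroll the circle so arcs that wrap past 2*pi can be scanned linearly.
    doubled = pts + [a + 2 * math.pi for a in pts]
    best, j = 0, 0
    for i in range(k):
        if j < i:
            j = i
        # Count the points in the half-open arc [pts[i], pts[i] + pi).
        while j < i + k and doubled[j] - doubled[i] < math.pi:
            j += 1
        best = max(best, j - i)
    # The origin is beaten only if some half-plane holds more than n/2 voters.
    return best <= n // 2


def estimate_probability(n, trials, rng):
    """Fraction of random configurations in which the origin is undominated."""
    hits = 0
    for _ in range(trials):
        angles = [rng.uniform(0.0, 2 * math.pi) for _ in range(n - 1)]
        hits += origin_undominated(angles)
    return hits / trials


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, chosen arbitrarily
    for n in (2, 4, 6, 8):
        estimate = estimate_probability(n, 20000, rng)
        print(n, round(estimate, 4), 1 / 2 ** (n - 2))
```

Each printed row gives $n$, the simulated frequency, and the value $1/2^{n-2}$ from Theorem 1 for comparison.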

The distributional method

Now let us illustrate the distributional method on the same model. (The
following closely follows analyses in [29,23,3,8].) Assume a continuum of
voters uniformly distributed on the circle. Every line passing through 0 has
mass $\le 1/2$ on either side of it. That is, each halfspace $h$ defined by a line
through 0 has $\mu(h) \le 1/2$. Thus $v_1$ is undominated. In fact by [3, Theorem
1] or [15, Theorem 2] it is the unique “dominant” or undominated point.
(The reader who is concerned about the “extra” point at 0 may observe
that this only improves the position of 0 with respect to equilibrium.)

The contrast between the two methods is evident. The finite sample 
method shows that the probability of 0 being undominated, indeed of a 
nonempty core, rapidly converges to 0. The distributional method says 
that for an infinite population, the probability of 0 being undominated is 1.

The example of this section reveals that there is a flaw in the distribu- 
tional method. It would be desirable for the outcome of the distributional 
method to coincide with the limiting behavior of finite samples, since the 
goal must be insight into the behavior of finite populations. Yet there 
could hardly be less consonance than in the example just given. In the 
next section we analyze the distributional method to explain how this flaw 
arises. 



3 An analysis of the distributional method 

A contrast with distributional analysis 

We have observed that the outcomes of the two methods can differ.
Let us point up an important distinction in how they operate. The
distributional method works directly with $\mu$, and quantities such as $\mu(h)$ are
considered. On the other hand, in the finite sample method a
configuration $V$ is drawn from $\mu$, and quantities such as $|V \cap h|$ are considered.
Informally, the distributional method counts up voters by looking at the
distribution function $\mu$ directly, while the finite sample method counts up
voters by looking at configurations drawn from $\mu$.

More formally, the distribution function $\mu$ analyzed in finite sample
analysis is not an infinite configuration; rather it is a probability measure
defined on the appropriate $\Omega$, which for fixed $n$ may be thought of as the set
of all possible configurations of cardinality $n$. In contrast the distributional
method treats $\mu$ as an infinite configuration.

A brief history of distributional analysis 
The term distribution is used in the economics
literature to mean both “configuration” and “distribution function” as defined
here. If we examine the literature of distributional analyses, we find that
it is intertwined with analyses giving necessary and/or sufficient conditions
for domination, local equilibrium, and/or global equilibrium in finite
configurations (to use terminology defined here) [17,15,3,1,23,16]. For instance,
Plott’s classic paper [17] is titled

A notion of equilibrium and its possibility [emphasis added] un- 
der majority rule. 

Plott performs no probabilistic analysis but observes (quite rightly) [ibid.,
page 792] that

it would only be an accident (and a highly improbable one) if
an equilibrium exists at all.

Tullock’s analysis [28] is, as Davis et al. [3, page 148] observe, “informally 
developed without theorems or proofs by the device of insightful examples.” 
Later papers such as [15,3,16] meld these analyses by formalizing ideas of 
Tullock [28,29] and simultaneously generalizing Plott’s results to infinite 
populations and/or more general preference functions (also global rather 
than local equilibrium). For instance, Davis et al. [3, page 148] contrast
their work with Plott’s since the latter 

allows only a finite number of individuals to be considered. 

Presumably Davis et al. view this “limitation” of Plott’s analysis as un- 
desirable because more insight is needed as to the behavior of large finite 
populations. 

In 1981 however Tullock remarks [30, page 190] that his analysis was
“not regarded as very reliable any more because McKelvey proved that
majority voting can reach any part of the issue space.” The analysis Tullock
refers to ultimately showed (see, e.g., [22,23,20]) that the set of
configurations for which equilibrium exists has measure 0, for $d \ge 3$ and also for
$d = 2$ and odd $n$, confirming what Plott had said all along. These
powerful results seem implicitly to invalidate the distributional analyses. Yet
this consequence does not even now appear to be fully assimilated in the
literature. The only unresolved case was $d = 2$, $n$ even (and that was my
original motive for undertaking this line of research).

Where is the flaw in distributional analysis? 

We have seen that some of the distributional analyses suggested
implications at odds with the instability theorems of McKelvey, Schofield,
Rubinstein, and others [12,13,21,20]. So is there a flaw in the distributional
arguments, and if so what is it? The crucial part is lucidly exposed by
Arrow in his 1969 paper [1]. Summarizing Tullock’s analysis, Arrow writes
[page 108]:

He [Tullock] assumes 

(1) that the number of voters is large, so large that we may
consider them to constitute a continuum. 

This assumption seems innocuous enough. In the mathematics literature,
passing to the limiting continuous case is a popular technique. The problem
is that majority rule requires us to evaluate $n/2$ where $n$ = the number
of voters, but the value $\infty/2$ is not well-defined. More precisely, if 0 is
undominated and only one voter is located at 0 then placing two additional
voters together at any location $x \neq 0$ must make 0 dominated (by the point
$\epsilon x$ for sufficiently small $\epsilon > 0$). But if $n$ is treated as infinite no shifting of
any finite number of voters changes the analysis, since $\infty/2 + 1 = \infty/2$.

What happens is that a new definition is needed when passing from the 
finite to the infinite case. Let us examine a specific definition from the 
literature. In an article by Davis, DeGroot, and Hinich [3], necessary and 
sufficient conditions are derived for the existence of a dominant point. As 
stated earlier, this analysis, unlike Plott’s, is intended to apply to infinite 
populations. The critical definition of a non-dominance relation R is quoted 
below [3, page 149]. 

Let P* denote the distribution of most preferred points of the
individuals. Let X be the most preferred point of an individual
chosen at random from the population. [note P* is referred to
as an infinite configuration in the previous sentence and as a
probability function in the next sentence] Given a (Borel) set
$S \subset E^n$, $\Pr(S)$ will denote the probability that $X \in S$ under
the distribution P*.

Definition 1: For any points $y \in E^n$ and $z \in E^n$, it is said
that $yRz$ if $\Pr(\|y - X\| \le \|z - X\|) \ge 1/2$.

The definition of the relation R just given is mathematically unambigu- 
ous and therefore is mathematically correct. The mathematics in the paper 
[3] is of course correct. But there is a problem with the interpretation of 
the mathematical results. In [3] the passage just cited continues with the 
following interpretation: 

In other words, yRz if and only if at least half the population 
either prefers y to z or is indifferent between y and z. 

What does the word “population” mean in the sentence just quoted? If 
we take it to mean the probability measure, then it would be accurate to 
say that 

yRz if and only if the measure (mass) of the subset of the
population that either prefers y to z or is indifferent between y and
z is at least 1/2.

But if the word “population” refers to a finite sample drawn from the 
distribution P*, then the meaning of yRz is given by the following theorem. 
Theorem 2. Suppose a finite number of points are drawn at random 
according to the distribution P*. Then 

yRz if and only if the probability is at least 1/2 that at least 
half the population either prefers y to z or is indifferent between 
y and z. 

Proof: Suppose $yRz$. If we were to take a finite sample under the
distribution P*, each sample point would with probability at least 1/2 be at
least as close to $y$ as to $z$. Then the number of points in the sample at
least as close to $y$ as to $z$ follows a binomial distribution with “success”
parameter $p \ge 1/2$. From the most elementary properties of the binomial
distribution, $p \ge 1/2$ implies the probability is at least 1/2 that at least half
the outcomes are “successes”. Conversely, if the probability is at least 1/2
that at least half of the Bernoulli trials end in success, it must be that the
parameter $p \ge 1/2$, whence $yRz$.
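The binomial fact used in this proof can be illustrated with a short exact computation. This is only a sketch under illustrative assumptions: the function name `prob_at_least_half` and the chosen values of $n$ and $p$ are not from the original text.

```python
from math import comb


def prob_at_least_half(n, p):
    """P(at least half of n independent Bernoulli(p) trials succeed)."""
    k_min = (n + 1) // 2  # smallest integer k with k >= n / 2
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))


if __name__ == "__main__":
    for p in (0.5, 0.55):
        for n in (11, 101, 501):
            print(p, n, round(prob_at_least_half(n, p), 4))
```

For odd $n$ and $p = 1/2$ the value is 1/2 by symmetry, however large $n$ is, whereas for any fixed $p > 1/2$ it tends to 1 as $n$ grows. This is the sense in which the finite sample reading of $yRz$ guarantees only probability 1/2.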
3.1 The heart of the problem

We have arrived at the heart of the problem. When going from finite 
to infinite populations, a new definition of the nondominance relation R 
was needed. Succinctly, let A denote “at least half the population either 
prefers y to z or is indifferent between y and z.” Then for any finite 
sample population, yRz means that A occurs with probability at least 1/2. But the
interpretation for infinite populations in [3] is, yRz means that A occurs. 

If the purpose of the mathematical analysis of infinite populations in [3] 
is to gain insight into the behavior of large finite populations, then there 
should be a closer correspondence between the meanings of yRz for finite 
samples and for infinite populations. 

The gap between the finite sample (Theorem 2) and the distributional
(Definition 1) methods just discussed is between 1/2 and 1. In the earlier
example of section 2 involving Theorem 1, the gap was (asymptotically) 
between 0 and 1. The larger gap in that example was due to the intersection 
of many events each with probability 1/2. 

4 An unsuccessful case: The Sonnenschein-Arrow Theorem

Let us now examine a specific case of analysis from the literature where 
the predictions of distributional analysis are misleading. In his article, 
Arrow continues by stating a theorem (he attributes to Sonnenschein) that 
generalizes Tullock’s example [1, pages 108-109]:

For any pair of alternatives $x, y$, let $N(x,y)$ be the number
of individuals who prefer $x$ to $y$. Then let $xMy$ be the
statement $N(x,y) > N(y,x)$ and $x\bar{M}y$ the statement that $N(x,y) \ge
N(y,x)$.

Theorem. Suppose that, for each alternative $x^0$, the set of
alternatives $x$ for which $x\bar{M}x^0$ is closed, and [suppose] the set
of alternatives [$x$] for which $xMx^0$ is convex. Then for any
compact (closed and bounded) convex set of alternatives $S$, there
is (at least) one alternative $x$ in $S$ such that $x\bar{M}y$ for all $y$ in
$S$.
Arrow later points out that “the hypotheses of the theorem are
obviously fulfilled in Tullock’s example” [ibid., page 110]. This is of course
correct, but only subject to assumption (1) above. For if we employ the
finite sample method of this paper, we find the probability converges to 0 that
the hypotheses of the Sonnenschein-Arrow theorem are fulfilled in Tullock’s
example. The following theorem states and proves this statement precisely.

Theorem 3. Under the hypotheses of Tullock’s example or of Theorem 1,
the probability that the set $\{x : xM0\}$ is convex converges to 0 as $n \to \infty$.

Proof: It suffices to consider only the more generous assumption of
Theorem 1. Recall the gap function $g(\theta)$ used in the proof of Theorem
1. If (and only if) $g(\theta) > 1$ then a strict majority of the points are in
the halfplane defined by the normal vector $v_\theta$ with orientation $\theta$. Then for
sufficiently small $\epsilon > 0$, the point $y = \epsilon v_\theta$ dominates 0, i.e. $y$ is in the
set $\{x : xM0\}$. Rotate the vector $v_\theta$ through an open halfplane, i.e. let $\theta$
range in the interval $[0, \pi)$. If the gap function $g(\theta)$ ever exceeds 1, drops
to 1 (or below), and later exceeds 1 again, the set $\{x : xM0\}$ will fail to
be convex (see Figure 3). This is because there will exist distinct values
$0 \le \theta_1 < \theta_2 < \theta_3 < \pi$ such that for all sufficiently small $\epsilon > 0$, $(\epsilon, \theta_1)$ and
$(\epsilon, \theta_3)$ are in the set, but $(\epsilon, \theta_2)$ is not in the set (using polar $(r, \theta)$ notation). If
the random walk executed by the gap function behaves in this fashion, then
the set is not convex, and we call the walk “bad”.

By Lemma 1, the values of the gap function execute an unbiased random
walk centered around 0. Therefore we may select the orientation of $\theta = 0$ so
that the walk has $n - 2$ steps and starts at 1. By the recurrence properties
of one dimensional symmetric random walks (e.g., [6]), the walk is bad with
probability approaching 1 as $n \to \infty$. In fact it will be bad infinitely often, so the set
$\{x : xM0\}$ will have many nonconvexities. This proves the theorem.
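The “bad walk” criterion in this proof lends itself to a quick simulation. The sketch below takes Lemma 1 as given and simulates the gap function directly as a walk with steps $\pm 2$, starting at 1 and running for $n - 2$ steps; it does not sample voter points. The function names, seed, and trial counts are illustrative assumptions.

```python
import random


def walk_is_bad(n, rng):
    """Simulate the gap function as a +/-2 random walk, as in Lemma 1.

    The walk starts at 1 and takes n - 2 steps. It is "bad" if it exceeds 1,
    later drops to 1 or below, and later exceeds 1 again.
    """
    g = 1
    excursions = 0  # number of separate stretches during which g > 1
    above = False   # whether the walk is currently above 1
    for _ in range(n - 2):
        g += rng.choice((2, -2))
        if g > 1 and not above:
            excursions += 1
            above = True
        elif g <= 1:
            above = False
        if excursions >= 2:
            return True
    return False


def bad_fraction(n, trials, seed=0):
    """Fraction of simulated walks of n - 2 steps that are bad."""
    rng = random.Random(seed)
    return sum(walk_is_bad(n, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    for n in (4, 20, 200, 2000):
        print(n, bad_fraction(n, 500))
```

The printed fractions should increase with $n$, in line with the statement of Theorem 3 that convexity becomes increasingly unlikely.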

It has previously been observed that the Sonnenschein-Arrow Theorem
can fail to be applicable. Greenberg [9], in a lovely paper on d-majority
equilibrium, gives a deterministic example with $n = 4$ voters in which
the set $\{x : xM0\}$ is not convex. At the time it must have seemed that
examples such as Greenberg’s would become less likely as $n$ increased. For
instance Kramer [11, page 313] remarks,

Several authors, . . . have argued that this instability is a
“small-sample” problem, and that majority equilibria will be more
likely when the number of voters is large; examples and results
supporting this thesis have been exhibited by ....



Theorem 3 demonstrates that Greenberg’s example represents the rule, 
not the exception. 

5 A successful case: The 64%-rule 

Although the distributional method can mislead, it sometimes gives
perfectly accurate predictions of the asymptotic behavior of finite populations.
An excellent example is found in a recent paper by Caplin and Nalebuff [2].
They consider a class of voting procedures, parameterized by $0 < \delta < 1$, in
which the status quo or incumbent can only be defeated or dislodged if more
than $\delta$ of the population supports the contesting alternative. Caplin and
Nalebuff first employ the distributional method: they show that if the
distribution function $\mu$ is concave, then the smallest $\delta$ that guarantees an
equilibrium (undefeatable) point, called the Simpson-Kramer min-max majority, is
$1 - (m/(m+1))^m$ ([2, Theorem 2]). They continue and prove ([2, Theorem
3]; essentially the same result is apparently found in [5, 2.4(iii), pp. 151-152,
5.3 p. 164]) that if a finite sample of size $n$ is drawn at random from the
concave distribution $\mu$, then the min-max majority of the sample converges to
the min-max majority of $\mu$ a.e. Hence, “the bounds of the paper extend to
large finite populations drawn from a concave density” [2, page 801]. Thus
the distributional method is a success in this case.

One must take some care in applying the bounds to the finite case.
Consider a uniform population density on an equilateral triangle (see Figure
4). The mass of $\mu$ in the shaded region is 5/9; it follows that the chances
are close to 50% that more than 5/9 of a random sample will fall in the
shaded region. But if this occurs, the center will not be a 5/9-majority core
point, however slightly the sample fraction exceeds 5/9.
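The “close to 50%” claim is a statement about a binomial count, and can be computed exactly. In this sketch the number of sample points falling in the shaded region is taken to be Binomial($n$, 5/9); the function name `prob_more_than` and the sample sizes are illustrative assumptions.

```python
from math import comb


def prob_more_than(n, num, den):
    """P(X / n > num / den) for X ~ Binomial(n, num / den)."""
    p = num / den
    k_min = (num * n) // den + 1  # smallest k with k / n > num / den
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))


if __name__ == "__main__":
    for n in (9, 90, 900):
        print(n, round(prob_more_than(n, 5, 9), 4))
```

Integer arithmetic is used for the threshold so that a sample fraction of exactly 5/9 is not counted as exceeding it.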

In fact a stronger statement is true: the triangle center is a 5/9-majority 
rule point with (asymptotic) probability no more than 1/8. 

Theorem 4. Let $n$ ideal points be generated independently from the
uniform distribution on a regular triangle. Let $p_n$ denote the probability that
the triangle center is a 5/9-majority point. Then $\limsup_n p_n \le 1/8$.
Proof: see appendix.
Theorem 4 does not negate Theorem 2 of [2] in a substantial way. To
begin with, there is the possibility that some other point very close to the
triangle center is undefeated. But more importantly, suppose that for any
$\epsilon > 0$, a $(5/9 + \epsilon)$-majority rule were employed. Then, by the almost sure
convergence of Theorem 3 of [2], the probability converges to 1 that the
triangle center is a majority point.

One of the beautiful things about Theorem 2 of [2] is the dimension-free
corollary that $(1 - 1/e)$-majority rule will have a core (which leads to the
title of the paper). Since for any fixed dimension $m$, there exists an $\epsilon > 0$
such that $1 - (m/(m+1))^m + \epsilon < 1 - 1/e$, the analog of Theorem 4 is
false for the dimension-free corollary. That is, an immediate and very nice
consequence of Theorem 3 of [2] and its corollary is the following:

Corollary: Let $n$ points be sampled independently from any concave
distribution on $\mathbb{R}^m$. Then the probability converges to 1, as $n \to \infty$, that the
centroid of the distribution is a $(1 - 1/e)$-majority rule point.
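The bound $1 - (m/(m+1))^m$ and its dimension-free limit $1 - 1/e$ are easy to tabulate. The sketch below simply evaluates the formula quoted from [2]; the function name `min_max_majority` and the range of $m$ are illustrative assumptions.

```python
import math


def min_max_majority(m):
    """The bound 1 - (m / (m + 1))**m for an m-dimensional issue space."""
    return 1 - (m / (m + 1)) ** m


if __name__ == "__main__":
    limit = 1 - 1 / math.e
    for m in (1, 2, 3, 5, 10, 100):
        bound = min_max_majority(m)
        # The slack limit - bound is the room available for a positive epsilon.
        print(m, round(bound, 4), round(limit - bound, 4))
```

For $m = 2$ the bound is 5/9, the value in the triangle example, and for every finite $m$ it lies strictly below $1 - 1/e$.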

How rapid is the convergence? In the case of a sign-invariant
distribution in two dimensions, Proposition 1 below states we can certainly expect
an error of order $1/\sqrt{n}$.

Proposition 1. Under the conditions of Theorem 1, the largest majority 
that can be mustered against the origin has expected value > ( n+ V'") . 
Proof: From the proof of Theorem 1, the gap function executes a random
walk around 0. The expected absolute distance from 0 at the end of a
random walk is $\sqrt{n}/2$ [6]. Dividing by the population size $n$ gives the
result.

I have not been able to determine rigorous lower bounds in general. If
the region is triangular instead of circular, the random walk is not
stationary (in fact it is no longer Markovian), but heuristically we can again
expect the maximum gap to be on the order of $\sqrt{n}$ in expected value from
the largest distributional gap. The convergence theorems cited in the next
section will tell us that the error levels can be expected not to exceed 

With a committee size of 100 (e.g. U.S. Senate), $1/\sqrt{n}$ is a fairly
substantial 10%. If we seek an explanation for the stability of 2/3-majority rule
in a group of this size, therefore, concavity is not quite enough. Concavity
together with a limitation to 2 issues ($m = 2$ dimensions) might suffice.
Alternatively, the extreme cases of triangular or simplicial distributions may
in reality be quite rare.

If the population size is 10,000 or more, drawn from a concave density,
the probability of stability under 2/3-majority rule appears to be fairly good.
From Proposition 1 we heuristically may expect that the maximum gap
will usually not exceed several multiples of the expected value $\sqrt{n}/2$, say
$6(\sqrt{n}/2) = 3\sqrt{n}$. At $n = 10{,}000$ this gap as a fraction of population is
$3\sqrt{10000}/10000 \approx 3\%$. But however high the policy space dimension $m$,
there is always a “cushion” of about 3% between 2/3 and $1 - 1/e$. On the
other hand, when the population size is $n = 250$ or less, equilibrium may be
unlikely. This is because Proposition 1 suggests that a gap of at least $\sqrt{n}/2$
will occur quite often. We then have $\sqrt{250}/(2 \cdot 250) \approx 3\%$, so the cushion is
not big enough unless additional restrictions are placed on the preferences
of the voter population.
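The arithmetic behind these population sizes can be reproduced in a few lines. This sketch only restates the heuristic of the text (a gap of some multiple of $\sqrt{n}/2$ voters, expressed as a fraction of $n$, compared with the cushion between 2/3 and $1 - 1/e$); the function names and the parameter `multiple` are illustrative assumptions.

```python
import math


def cushion():
    """Difference between the 2/3 rule and the dimension-free bound 1 - 1/e."""
    return 2 / 3 - (1 - 1 / math.e)


def gap_fraction(n, multiple=1.0):
    """A gap of `multiple` * sqrt(n) / 2 voters, as a fraction of n."""
    return multiple * (math.sqrt(n) / 2) / n


if __name__ == "__main__":
    print("cushion", round(cushion(), 4))
    print("n = 10000, six multiples", round(gap_fraction(10000, 6), 4))
    print("n = 250, one multiple", round(gap_fraction(250), 4))
    print("n = 100, 1/sqrt(n)", round(1 / math.sqrt(100), 4))
```

The comparison is heuristic, as in the text: it indicates the scale of the sampling error relative to the cushion rather than a probability of stability.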

Thus the min-max majority results of [2], particularly the dimension- 
free bounds, provide a successful application of distributional analysis to 
large finite populations, though some care must be taken in applying the 
results to smaller committee sizes. 



6 General clues 

Why do the distributional results discussed in section 5 apply to large finite 
populations, while those discussed previously do not? Part of the answer 
has to do with the difference between non-dominance and strict dominance. 
Recall from section 4 that the finite sample meaning of the non-dominance 
relation R does not converge to the meaning in the distributional case. 
In contrast, the strict dominance relation P, defined by yPz iff yRz and not zRy,
does converge. That is, if yPz in the distributional sense, and a random
sample of n points is taken, then yPz with respect to that finite sample
with probability converging to 1 as $n \to \infty$. (This follows immediately from
the weak law of large numbers and Davis et al.'s observation that “yPz if
and only if $\Pr(\|y - X\| < \|z - X\|) > 1/2$.”)
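
A small simulation illustrates this convergence for a single fixed pair (y, z). It is a sketch under assumptions not taken from the paper: the distribution is uniform on the square $[-1,1]^2$, y is the center, z = (0.5, 0), and the function names are hypothetical. In this setup $\Pr(\|y - X\| < \|z - X\|) = 0.625$, so yPz holds in the distributional sense.

```python
import random

def fraction_closer_to_y(n, y, z, rng):
    """Fraction of n uniform sample points on [-1,1]^2 strictly closer to y than to z."""
    closer = 0
    for _ in range(n):
        x = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
        dy = (x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2
        dz = (x[0] - z[0]) ** 2 + (x[1] - z[1]) ** 2
        if dy < dz:
            closer += 1
    return closer / n

def prob_sample_agrees(n, trials, rng, y=(0.0, 0.0), z=(0.5, 0.0)):
    """Estimate the probability that yPz also holds in a finite sample of size n."""
    wins = sum(fraction_closer_to_y(n, y, z, rng) > 0.5 for _ in range(trials))
    return wins / trials

if __name__ == "__main__":
    rng = random.Random(0)
    for n in (11, 101, 1001):
        print(n, prob_sample_agrees(n, 200, rng))
```

The estimated probability should approach 1 as n grows, for this one fixed z. The next paragraph explains why this pointwise statement does not extend to all z simultaneously.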

This difference is not enough. For example, suppose distribution $\mu$
is uniform in a square centered at y. Then for all $z \neq y$, yPz in the
distributional sense. But if a finite sample of size 2n is taken, then by
Theorem 1 with probability converging to 1 there will exist $z \neq y$ such that
zPy. However, suppose y strictly dominated all z in some compact set
Z. We might then argue, if $\mu$ were continuous, that the strict domination
occurred with a minimum gap of some $\delta > 0$. If we could then find a way
to reduce consideration of Z to a finite, relatively small (e.g. polynomial in
n) number of points, we could establish the desired behavior of the finite
sample. These ideas are found in the proof of Theorem 3 in [2], where
Lemma 1 (page 807) provides the reduction to a finite number (n + 1)
of points. Similar ideas are found in [26], where the fundamental basis
extreme point theorem of linear programming provides the reduction to a
finite number.

The preceding suggests that the mathematical tools for the convergence
of empirical measures may be appropriate to these questions.¹ This turns
out to be the case. The interested reader should consult chapter 2, “Uniform
Convergence of Empirical Measures,” of Pollard’s excellent book [18].

A couple of the most pertinent results are cited below (specialized to our 
case and adapted to our terminology): 

Definition. Let n points be drawn at random according to a probability
measure $\mu$ on $\mathbb{R}^m$. The empirical measure $\mu_n$ is that which places mass $1/n$
at each of the n points (obviously they need not be distinct).

Let $\mathcal{C}$ denote a class of sets in $\mathbb{R}^m$. For any $c \in \mathcal{C}$, it follows that $\mu_n(c)$
simply equals the fraction of the points which fell in c. The class of most
interest to us is the set of all closed and open halfspaces. Accordingly, let

$$\mathcal{C} = \{c : c = [p \cdot x \le p^0];\ p \in \mathbb{R}^m,\ p^0 \in \mathbb{R}\}. \tag{1}$$

Also let $\mathcal{C}^+ = \{c : c = [p \cdot x < p^0]\}$, the set of open halfspaces, and let
$\mathcal{D} = \mathcal{C} \cup \mathcal{C}^+$. The
uniform convergence theorem of [18] implies that the empirical measure
converges to $\mu$ over these classes.

Theorem 5. Let $\mu$ be a probability measure on $\mathbb{R}^m$. Then

$$\sup_{d \in \mathcal{D}} |\mu_n(d) - \mu(d)| \to 0 \quad \text{almost surely.} \tag{2}$$

Proof: this follows from Theorem 14 (page 18), Lemma 15(i,ii) (page
18), and Lemma 18 (pages 20-21) of [18].

¹ I am indebted to Bob Foley, Richard McKelvey, and Gideon Weiss for suggesting this
line of attack.

This means that even if we consider all half-spaces h, the largest gap
between the fraction of points falling in the half-space, and the expected
fraction ($\mu(h)$), converges to 0.
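
The shrinking of this largest gap can be observed numerically. The following is a minimal sketch, not the method of [18]. It assumes a standard Gaussian on the plane (chosen only because the mass of a halfspace at signed distance t from the origin is then the normal CDF at t), and it scans a finite set of directions, so it yields a lower bound on the supremum in (2). All function names are hypothetical.

```python
import math
import random

def normal_cdf(t):
    """Mass that a standard Gaussian assigns to the halfspace [p . x <= t], |p| = 1."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def max_halfspace_discrepancy(points, num_directions=12):
    """Largest |mu_n(d) - mu(d)| over open and closed halfspaces in a few directions."""
    n = len(points)
    worst = 0.0
    for k in range(num_directions):
        # Directions in [0, pi) suffice: the opposite halfspace is the
        # complement, which has the same discrepancy.
        angle = math.pi * k / num_directions
        px, py = math.cos(angle), math.sin(angle)
        proj = sorted(px * x + py * y for x, y in points)
        for i, t in enumerate(proj):
            mass = normal_cdf(t)
            # (i + 1)/n is the closed-halfspace fraction at threshold t,
            # i/n the open-halfspace fraction (ties have probability zero).
            worst = max(worst, (i + 1) / n - mass, mass - i / n)
    return worst

def sample_gaussian(n, rng):
    """Draw n independent points from the standard Gaussian on the plane."""
    return [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]

if __name__ == "__main__":
    rng = random.Random(1)
    for n in (200, 1000, 5000):
        print(n, round(max_halfspace_discrepancy(sample_gaussian(n, rng)), 4))
```

Within each direction the scan over thresholds is exact, since the discrepancy can only peak at a sample point.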

To demonstrate the usefulness of Theorem 5, we invoke it to prove the
convergence of the min-max majority. The first part of Theorem 6 generalizes
Theorem 3 of [2] from bounded continuous to arbitrary distributions;
the second part of Theorem 6 is very similar to 2.4 (iii) and Proposition
10 in [5]. Yet the proof of Theorem 6 is much shorter and simpler. This
confirms the appropriateness of this line of attack (and the wisdom of my
colleagues).

Theorem 6. Let $\mu$ be a probability measure on $\mathbb{R}^m$. Let n points be randomly
and independently sampled from $\mu$. Then the min-max majority value of
the sample, $\alpha(\mu_n)$, converges to the distributional min-max majority $\alpha(\mu)$
almost surely. If in addition $\mu$ is continuous and possesses a unique min-max
winner point z, then the min-max winner of the sample converges a.s. to
z.

Proof: If z is an $\alpha$-majority point with respect to $\mu$ then by Theorem 5 it
will be an $(\alpha + \epsilon)$-majority point for $\mu_n$ eventually, for any positive $\epsilon$. Thus
$\limsup_n \alpha(\mu_n) \le \alpha(\mu)$. Conversely, for any $\beta < \alpha(\mu)$, set $\delta = \alpha(\mu) - \beta$. For
all $x \in \mathbb{R}^m$, there exists a hyperplane $h_x$ through x such that a halfspace $h_x^+$
defined by $h_x$ has mass $\mu(h_x^+) \ge \beta + \delta$. Again by Theorem 5, the supremum
of the fractional discrepancies over all these halfspaces converges to 0 a.s.
Thus,

$$\inf_x \mu_n(h_x^+) \ge \beta + \delta/2 \tag{3}$$

eventually, with probability 1 (a fraction of at least $\beta + \delta/2$ can be mustered
against every point). Hence $\liminf_n \alpha(\mu_n) \ge \alpha(\mu)$. This proves the first
part of Theorem 6.

The proof of the first part has moreover established that z has limiting
minimal winning supermajority fraction $\alpha$. It remains to show that no
points other than z can also be winning with fraction $\alpha$. Accordingly let
$\epsilon > 0$ be arbitrary. Let $S \subset \mathbb{R}^m$ be an enormous ball containing z and with
$\mu(S) > \alpha$, so that eventually with probability 1 no point outside S can
be an $\alpha$-majority winner. Let T denote S with the small ball of radius $\epsilon$
around z removed, $T = S \setminus B(z, \epsilon)$. By the compactness of T and continuity
of $\mu$, there exists $\beta$ such that the min-max majority over all $x \in T$ equals
$\beta$. By the uniqueness of z, $\beta > \alpha$. Then by the same argument as led to
inequality (3), eventually with probability 1 we have, for $\delta > 0$ chosen
small enough that $\beta - \delta/2 > \alpha$:

$$\inf_{x \in T} \mu_n(h_x^+) \ge \beta - \delta/2 > \alpha$$

Hence eventually no point in T will be an $\alpha$-majority winner. This completes
the proof.

Theorem 6 ensures that convergence of $\alpha(\mu_n)$ holds for any distribution.
This is of particular importance for empirical applications, because spatial 
voting data is often discrete. For example, the Senate data in [10] and other 
studies [24] are taken from roll call votes. Similarly, most public opinion 
polls ask yes/no questions or limit answers to integers in a small range 
(e.g. 1-5). In all these cases the real data will be discrete. Even if kernel 
smoothing ([18, pp. 35,42]) were employed the resulting distributions might 
not be continuous. Also notice the following: if two groups of samples were
taken from $\mu$, Theorem 5 would ensure the convergence of the two empirical
measures to each other. This matches the scenario described in section 1, 
where information from polls or past voting records is used to predict an 
outcome. 

In general, we consider a function(al) f whose domain is the set of
probability measures and whose range is the reals. For example, f might
be an indicator function for the event “0 is undominated”, or $f_i$ might be
the ith coordinate of the center of mass of the distribution. When f is
continuous, the uniform convergence of the empirical measure will ensure
the convergence of $f(\mu_n)$ to $f(\mu)$.

Consider the indicator function just defined. It is not continuous, in
the following sense: there exists $\epsilon > 0$ such that for all $\lambda > 0$, there exist
empirical distributions $\mu_n$ and $\hat{\mu}_n$ satisfying

$$\sup_{d \in \mathcal{D}} |\mu_n(d) - \hat{\mu}_n(d)| < \lambda$$

but $|f(\mu_n) - f(\hat{\mu}_n)| > \epsilon$. (Just take $\epsilon = .9$.) Moreover the discontinuity
occurs just at the distributions of interest, where the fraction on one side of 
a hyperplane is 1/2. From a more general point of view, this explains the 
failure of finite behavior to converge to distributional behavior as discussed 
in sections 3 and 4. 
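
The discontinuity can be exhibited concretely. The sketch below is a one-dimensional illustration, which is an assumption made for simplicity: halfspaces become half-lines, and 0 is undominated under majority rule exactly when at most half of the points lie strictly on either side of it. The two configurations differ in the position of a single voter, and the function names are hypothetical.

```python
from fractions import Fraction

def zero_is_undominated(points):
    """Indicator f: in one dimension, 0 is undominated iff it is a median."""
    n = len(points)
    left = sum(1 for x in points if x < 0)
    right = sum(1 for x in points if x > 0)
    return 2 * left <= n and 2 * right <= n

def sup_halfline_discrepancy(a, b):
    """Exact sup over open and closed half-lines of |mu_a(d) - mu_b(d)|."""
    assert len(a) == len(b)
    n = len(a)
    worst = Fraction(0)
    # The discrepancy can only change at a sample point, so those thresholds suffice.
    # Right-facing half-lines are complements, hence give the same values.
    for t in set(a) | set(b):
        closed = Fraction(sum(x <= t for x in a) - sum(x <= t for x in b), n)
        open_ = Fraction(sum(x < t for x in a) - sum(x < t for x in b), n)
        worst = max(worst, abs(closed), abs(open_))
    return worst

def split_configurations(k):
    """An exact 50:50 split around 0, and the same split with one voter moved."""
    balanced = [-1] * k + [1] * k
    tilted = [-1] * (k - 1) + [1] * (k + 1)
    return balanced, tilted

if __name__ == "__main__":
    for k in (5, 50, 500):
        a, b = split_configurations(k)
        print(2 * k, sup_halfline_discrepancy(a, b),
              zero_is_undominated(a), zero_is_undominated(b))
```

The distance between the two empirical measures is 1/n, which falls below any $\lambda$ for large n, yet the indicator takes the value 1 on one configuration and 0 on the other.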
The mathematical guideline for convergence is the continuity of the
functional. Let us attempt to formulate a less technical rule of thumb to 
give a general sense of how to make accurate predictions for finite popu- 
lations based on distributional results: if the event or quantity of interest 
depends on the precise way voters are split among regions, then a conver- 
gence problem is apt to arise; if it relies instead on having a certain fraction 
or more in a region, then the result is apt to apply to the large finite case, 
possibly with the fraction perturbed slightly. 

Let us apply these observations to the yolk radius convergence shown in 
[26]. A hyperplane is median if the two closed halfspaces it defines each
contain at least half the population. The yolk is the smallest ball intersecting
all median hyperplanes [7,14]. If there is a simple majority rule core point
the yolk is that point. Under what circumstances can we expect the yolk
radius to be small? From a distributional point of view,² a yolk radius of 0
corresponds to a nonempty core. Necessary and sufficient conditions for a
nonempty core, in the distributional sense, are (see [3,15]) that $\mu$ be weakly
centered: every hyperplane through 0 is a median hyperplane. Therefore a
distributional analysis predicts that weak centeredness would be necessary 
and sufficient for the yolk radius of random samples to converge to 0. 

Our rule of thumb suggests that there may be a problem with the exact 
50:50 split of the weak centeredness condition, but that a (50 + e) : (50 — 
e) splitting condition would be apt to work. It turns out that the true 
necessary and sufficient condition is that $\mu$ be strictly centered [26]: for every
hyperplane not passing through 0, the halfspace it defines not containing 
the origin must contain strictly less than half the population. This outcome 
seems well in accord with the guidelines proposed above. 

We can invoke Theorem 5 to prove the sufficiency half of this result,³
though under an additional assumption of continuity of the distribution $\mu$.
Despite the lessened generality of Theorem 7, the ease and brevity of its
proof are noteworthy.

Theorem 7. Let n points be sampled independently from $\mu$, a strictly
centered continuous distribution on $\mathbb{R}^m$. Then the radius
of the yolk of the sample converges to 0 a.s. as $n \to \infty$.

Proof: see Appendix. 

² This distributional analysis is due to Richard McKelvey.

³ The essentials of this proof were suggested to me independently by Robert Foley,
Richard McKelvey, Loren Platzman, and Gideon Weiss.
7 Rereading Tullock’s paper on the general
irrelevance... 

The results in this paper might seem to invalidate claims in Tullock’s orig- 
inal work. A careful reading shows this is not so. Tullock’s original paper, 
“The general irrelevance of the general impossibility theorem” [28], is in my 
opinion an altogether brilliant piece of work, combining important empiri- 
cal evidence (the scarcity of actual cycling or chaos) with abundant creative 
inspiration and exceptional mathematical intuition (as well as dramatic ex- 
position). A careful reading reveals that Tullock is actually discussing finite 
configurations, and only appeals to the infinite configurations as an intu- 
itive aid. For example, after describing a uniform distributional model, 
Tullock writes ([28, page 259]): 

This might be called the perfect geometrical model, in which 
the number of voters whose optima fall in a given area is ex- 
actly proportional to its area. Given that the voters are finite 
in number, small discontinuities would appear. Two areas that 
differ little in size might have the same number of voters; in- 
deed, the smaller might even have more. Cycles are, therefore, 
possible, but they would become less and less important as the 
number of choosing individuals increases. 

Later, Tullock specifically remarks that the probability of cycling should 
increase as the population grows [IBID, page 261]: 

For close to the center, the area which is preferred to A 
would be farther from the center than A. Cycling becomes 
more probable. When we get very close to the center a point 
randomly selected from among those which could get a majority 
over the given point would have a good chance of being farther 
from the center than it is. At this point, however, most voters 
will feel that new proposals are splitting hairs, and the motion 
to adjourn will carry. 
This intuitive statement is in accord with Theorem 1. Thus Tullock is not
claiming that cycles won’t usually exist in large populations.⁴ Tullock’s
main point is that they won’t matter.

One of the arguments Tullock advances to support his point is that 
unless proposals were carefully manipulated, “the voting process would in
all probability lead to rapid movement toward the center” [28, page 261].
This argument is actually a loose forerunner of the yolk, the smallest ball 
intersecting all median hyperplanes. (Tullock’s discussion of intersections 
of median lines, pages 261-262, is especially evocative of the yolk.) 

Since that time the yolk has been rigorously established by Ferejohn, 
McKelvey, and Packel [7] and McKelvey [14]. More recently it has been 
proved that the radius of the yolk does converge to 0 a.s. for the distribution 
of Tullock’s example (or any other centered distribution) [26]. Considering 
the length of time by which Tullock’s work preceded the mathematical 
development of the appropriate technical tools, Tullock’s insights seem all 
the more remarkable. 
9 Appendix: Proof of Theorems 4 and 7

Proof of Theorem 4: The three lines through the triangle center in Figure 5
divide the triangle into the six regions labelled a, b, c, d, e, f. For notational
ease, let the region label also represent the number of sample points falling
in that region. If the center is to be a 5/9-majority point, then $b + c + d \le 5n/9$,
and similarly $d + e + f \le 5n/9$ and $f + a + b \le 5n/9$. These imply our key
inequalities: $b - e \le n/9$; $c - f \le n/9$; $d - a \le n/9$. That is,
the number of points in each rhombus is no more than n/9 more than
the number of points in the opposing triangle. Applying the strong law of
large numbers, the actual number in each region, for large n, will be within
$O(\sqrt{n})$ of its expected value with very high probability (geometrically
decreasing chance of failure). We may therefore condition on the partitioning
among the three rhombus-triangle pairs being close to the expected value
of n/3 in each of these three paired regions, and the error in our resulting
estimate converges to 0. Once we condition on this likely event, the
three key inequalities become independent. Now approximating the binomial
distribution of parameters $\sim n/3$, 1/3 with a normal distribution (by
the central limit theorem), and since $n^{1/2}$ dominates $n^{1/4}$, it follows
that the probability is asymptotically 1/2 that the gap between a rhombus
and its opposing triangle satisfies any one of the three inequalities. (In
other words the median and mean of the binomial are very close.) Therefore,
the conditional probability that the three key inequalities all hold is
asymptotically 1/8. Thus $p_n$ in the limit is bounded by 1/8. This proves
Theorem 4.

⁴ He also argues that “it is possible, by simple majority voting, to reach points at almost
any portion of the issue space”, an adumbration of the classic chaos theorems of McKelvey
and Schofield [12,13,21,22].

The upper bound of 1/8 in Theorem 4 can be extended easily to $1/2^{m+1}$
for m dimensions.
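
The bound of 1/8 can be compared against a Monte Carlo estimate. The sketch below rests on an assumption about Figure 5 that is not stated in the text: the three lines through the center are taken parallel to the sides, which in barycentric coordinates are the lines where one coordinate equals 1/3. This yields three rhombi of area 2/9 (one at each vertex) and three small triangles of area 1/9, consistent with the binomial parameters above. The function names, sample size, and trial count are hypothetical choices.

```python
import random

def key_inequalities_hold(n, rng):
    """Sample n uniform points in a triangle and test the three key inequalities."""
    rhombus = [0, 0, 0]   # rhombus[i]: only barycentric coordinate i exceeds 1/3
    triangle = [0, 0, 0]  # triangle[i]: only barycentric coordinate i is below 1/3
    for _ in range(n):
        # Uniform barycentric coordinates by reflecting the unit square.
        u, v = rng.random(), rng.random()
        if u + v > 1.0:
            u, v = 1.0 - u, 1.0 - v
        coords = (u, v, 1.0 - u - v)
        above = [c > 1.0 / 3.0 for c in coords]
        if sum(above) == 1:
            rhombus[above.index(True)] += 1
        else:
            triangle[above.index(False)] += 1
    # rhombus[i] and triangle[i] face each other across the center.
    return all(9 * (rhombus[i] - triangle[i]) <= n for i in range(3))

def estimate_probability(n=450, trials=1000, seed=0):
    """Fraction of trials in which all three key inequalities hold."""
    rng = random.Random(seed)
    hits = sum(key_inequalities_hold(n, rng) for _ in range(trials))
    return hits / trials

if __name__ == "__main__":
    print(estimate_probability())
```

Since the key inequalities are only necessary for the center to be a 5/9-majority point, this estimate bounds $p_n$ from above rather than estimating it.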

I would moreover conjecture that $p_n \to 0$ as $n \to \infty$.

Theorem 7.⁵ Let n points be sampled independently from $\mu$, a strictly
centered continuous distribution on $\mathbb{R}^m$. Then the radius of the yolk of the
sample converges to 0 a.s. as $n \to \infty$.

Proof: Following the proof in [26], we show that the largest distance
from 0 to any median hyperplane converges to 0. Since this distance is an
upper bound on the yolk radius, the result will follow.

For any $x \neq 0$, let $h_x^+$ denote the halfspace not containing the origin
defined by the hyperplane normal at x. By strict centeredness $\mu(h_x^+) < 1/2$.
By continuity $\mu(h_x^+)$ is continuous in x.

Let $\epsilon > 0$ be arbitrary. Clearly the largest vote attained against 0
by points $\epsilon$ or more away from 0 is attained by points $\epsilon$ away, or more
accurately

$$\sup_{\|x\| \ge \epsilon} \mu(h_x^+) = \sup_{\|x\| = \epsilon} \mu(h_x^+).$$

By compactness of the set the latter supremum is taken over, and continuity,
the supremum is attained. Thus there exists $\beta < 1/2$ such that for all
$\|x\| \ge \epsilon$, we have $\mu(h_x^+) \le \beta$.

The halfspaces $h_x^+$ are contained in the class $\mathcal{C}$. Let the n points be
sampled from $\mu$. Apply Theorem 5 to find that with probability 1, as n
increases, eventually

$$\mu_n(h_x^+) < \beta + \tfrac{1}{2}\left(\tfrac{1}{2} - \beta\right) < 1/2 \quad \forall\, \|x\| \ge \epsilon.$$

This implies that there is no median hyperplane at distance $\epsilon$ or more
from 0, whence the result follows.

⁵ See the acknowledgment footnote, page 18.